Robust and Consistent Sampling
Similar resources
Consistent Weighted Sampling
We describe an efficient procedure for sampling representatives from a weighted set such that for any weightings S and T, the probability that the two choose the same sample is equal to the Jaccard similarity between them: $\Pr[\mathrm{sample}(S) = \mathrm{sample}(T)] = \frac{\sum_x \min(S(x), T(x))}{\sum_x \max(S(x), T(x))}$, where sample(S) is a pair (x, y) with 0 < y ≤ S(x). The sampling process takes expected computation li...
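The collision property stated in this abstract can be illustrated with a deliberately naive sampler. The sketch below is not the paper's procedure: it assumes integer weights, expands each element x into the pairs (x, 1), ..., (x, S(x)), and returns the pair with the smallest hash, so its cost grows with the weights. The names `weighted_jaccard`, `sample`, and `_pair_hash`, and the use of BLAKE2b as a stand-in for a random hash function, are assumptions made for illustration.

```python
import hashlib


def weighted_jaccard(S, T):
    """Sum of element-wise minima divided by sum of element-wise maxima."""
    keys = set(S) | set(T)
    num = sum(min(S.get(x, 0), T.get(x, 0)) for x in keys)
    den = sum(max(S.get(x, 0), T.get(x, 0)) for x in keys)
    return num / den if den else 1.0


def _pair_hash(seed, x, y):
    # Hypothetical helper: a 64-bit hash of (seed, x, y), treated as random.
    data = f"{seed}|{x}|{y}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def sample(S, seed):
    """Return the pair (x, y), 0 < y <= S(x), with the smallest hash value.

    Assumes S maps elements to positive integers and is non-empty.
    """
    pairs = ((x, y) for x, w in S.items() for y in range(1, w + 1))
    return min(pairs, key=lambda p: _pair_hash(seed, *p))


def collision_rate(S, T, trials=3000):
    """Fraction of seeds for which S and T pick the same pair."""
    hits = sum(sample(S, seed) == sample(T, seed) for seed in range(trials))
    return hits / trials
```

Because the pairs shared by S and T are exactly those counted by the numerator, and the union of pairs is counted by the denominator, the fraction of seeds on which the two samples agree should approach `weighted_jaccard(S, T)` as `trials` grows.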
Consistent Subset Sampling
Consistent sampling is a technique for specifying, in small space, a subset S of a potentially large universe U such that the elements in S satisfy a suitably chosen sampling condition. Given a subset I ⊆ U it should be possible to quickly compute I ∩ S, i.e., the elements in I satisfying the sampling condition. Consistent sampling has important applications in similarity estimation, and estima...
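One common way to realize a sampling condition of this kind is a hash threshold: an element belongs to S exactly when its hash falls below a fixed fraction of the hash range, so S is specified by a seed and a rate alone. The sketch below assumes that construction and simply scans I to compute I ∩ S; it does not attempt anything faster. The names `in_sample` and `sample_subset` and the BLAKE2b-based hash are illustrative assumptions, not taken from the abstract.

```python
import hashlib


def in_sample(x, rate, seed=0):
    """Sampling condition: the 64-bit hash of x is below rate * 2**64."""
    data = f"{seed}|{x}".encode()
    h = int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")
    return h < rate * 2**64


def sample_subset(I, rate, seed=0):
    """Return I ∩ S by testing every element of I (a linear scan)."""
    return {x for x in I if in_sample(x, rate, seed)}
```

The decision for each element depends only on the element, the seed, and the rate, so two parties sampling overlapping sets independently agree on every shared element.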
Consistent Robust Regression
We present the first efficient and provably consistent estimator for the robust regression problem. The area of robust learning and optimization has generated a significant amount of interest in the learning and statistics communities in recent years owing to its applicability in scenarios with corrupted data, as well as in handling model mis-specifications. In particular, special interest has ...
Improved Consistent Weighted Sampling Revisited
Min-Hash is a popular technique for efficiently estimating the Jaccard similarity of binary sets. Consistent Weighted Sampling (CWS) generalizes the Min-Hash scheme to sketch weighted sets and has drawn increasing interest from the community. Due to its constant-time complexity independent of the values of the weights, Improved CWS (ICWS) is considered as the state-of-the-art CWS algorithm. In ...
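To make the constant-time claim concrete, here is a minimal sketch of the ICWS sampling step as it is usually presented in the literature: each element draws two Gamma(2, 1) variates and one uniform variate, and the work per element does not depend on the size of its weight. Seeding a per-element generator from the string `"index|key"` is an assumption made here to keep the draws consistent across sets, and the function names are hypothetical; this should be read as an illustration rather than a reference implementation.

```python
import math
import random


def icws_sample(weights, index):
    """Return one ICWS sample (key, t) for a dict of positive real weights."""
    best_key, best_t, best_a = None, None, math.inf
    for k, w in weights.items():
        if w <= 0:
            continue  # zero-weight elements cannot be sampled
        # The same (index, key) always yields the same draws, in any set.
        rng = random.Random(f"{index}|{k}")
        r = rng.gammavariate(2.0, 1.0)
        c = rng.gammavariate(2.0, 1.0)
        beta = rng.random()
        t = math.floor(math.log(w) / r + beta)
        y = math.exp(r * (t - beta))
        a = c / (y * math.exp(r))
        if a < best_a:
            best_key, best_t, best_a = k, t, a
    return best_key, best_t


def estimate_similarity(S, T, num_samples=2000):
    """Estimate weighted Jaccard similarity as the fraction of matching samples."""
    hits = sum(icws_sample(S, i) == icws_sample(T, i) for i in range(num_samples))
    return hits / num_samples
```

With `num_samples` independent sample indices, the fraction of matching samples serves as an estimate of the weighted Jaccard similarity, and its standard error shrinks roughly as the inverse square root of `num_samples`.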
Journal
Journal title: IEEE Signal Processing Letters
Year: 2009
ISSN: 1070-9908,1558-2361
DOI: 10.1109/lsp.2009.2023481